6 research outputs found

    On the Effects of Data Heterogeneity on the Convergence Rates of Distributed Linear System Solvers

    We consider the fundamental problem of solving a large-scale system of linear equations. In particular, we consider the setting where a taskmaster intends to solve the system in a distributed/federated fashion with the help of a set of machines, each of which holds a subset of the equations. Although several approaches exist for solving this problem, a rigorous comparison between the convergence rates of the projection-based methods and those of the optimization-based ones is missing. In this paper, we analyze and compare these two classes of algorithms with a particular focus on the most efficient method from each class, namely, the recently proposed Accelerated Projection-Based Consensus (APC) and the Distributed Heavy-Ball Method (D-HBM). To this end, we first propose a geometric notion of data heterogeneity called angular heterogeneity and discuss its generality. Using this notion, we bound and compare the convergence rates of the studied algorithms and capture the effects of both cross-machine and local data heterogeneity on these quantities. Our analysis yields a number of novel insights and shows that APC is the most efficient method in realistic scenarios where data heterogeneity is large. Our numerical analyses validate our theoretical results. Comment: 11 pages, 5 figure

    Connectivity-Aware Semi-Decentralized Federated Learning over Time-Varying D2D Networks

    Semi-decentralized federated learning blends the conventional device-to-server (D2S) interaction structure of federated model training with localized device-to-device (D2D) communications. We study this architecture over practical edge networks with multiple D2D clusters modeled as time-varying and directed communication graphs. Our investigation results in an algorithm that controls the fundamental trade-off between (a) the rate of convergence of the model training process towards the global optimizer, and (b) the number of D2S transmissions required for global aggregation. Specifically, in our semi-decentralized methodology, D2D consensus updates are injected into the federated averaging framework based on column-stochastic weight matrices that encapsulate the connectivity within the clusters. To arrive at our algorithm, we show how the expected optimality gap in the current global model depends on the two largest singular values of the weighted adjacency matrices (and hence on the densities) of the D2D clusters. We then derive tight bounds on these singular values in terms of the node degrees of the D2D clusters, and we use the resulting expressions to design a threshold on the number of clients required to participate in any given global aggregation round so as to ensure a desired convergence rate. Simulations performed on real-world datasets reveal that our connectivity-aware algorithm significantly reduces the total communication cost required to reach a target accuracy compared with baselines, depending on the connectivity structure and the learning task. Comment: 10 pages, 5 figures. This paper has been accepted to ACM-MobiHoc 202
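As a rough illustration of why column-stochastic weights are a natural fit for directed D2D clusters, the sketch below runs push-sum (ratio) consensus on a small directed graph. Push-sum is a standard technique substituted here for illustration; it is not claimed to be the update rule used in the paper above. The 3-device cluster, the matrix `W`, and the function names are all hypothetical.

```python
def mix(weights, values):
    """Apply one linear mixing step: returns weights @ values (pure Python)."""
    n = len(values)
    return [sum(weights[i][j] * values[j] for j in range(n)) for i in range(n)]


def push_sum(weights, values, rounds=60):
    """Ratio consensus with a column-stochastic matrix (illustrative only)."""
    x = list(values)           # mass carried by each device
    w = [1.0] * len(values)    # auxiliary push-sum weights, start at 1
    for _ in range(rounds):
        x = mix(weights, x)
        w = mix(weights, w)
    # Each device's estimate of the average is the ratio x_i / w_i.
    return x, [xi / wi for xi, wi in zip(x, w)]


# Hypothetical directed cluster 0 -> 1 -> 2 -> 0 with self-loops.
# W[i][j] is the share of device j's mass sent to device i, so every
# COLUMN sums to 1 (the rows do not).
W = [
    [0.5, 0.0, 0.3],
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.7],
]
local_values = [1.0, 2.0, 6.0]  # assumed scalar "models"; true average is 3.0

mass, estimates = push_sum(W, local_values)
print(sum(mass))   # total mass is preserved by column-stochastic mixing
print(estimates)   # every device's ratio approaches the average
```

Because each column sums to one, the total mass across devices is conserved at every step, and dividing by the auxiliary weights corrects the bias that arises when the rows do not also sum to one.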

    Mathematical Tools and Convergence Results for Dynamics over Networks

    Mathematical models of networked dynamical systems are ubiquitous: they are used to study power grids, networks of webpages, robotic and sensor networks, and social networks, to name a few. Importantly, most real-world networks are time-varying and are affected by stochastic phenomena such as adversarial attacks and communication link failures. Time-varying networks, therefore, have been under study for a few decades. However, our current understanding of the dynamical processes evolving over such networks is limited. This observation motivates the two-pronged objective of this dissertation: (i) to use theoretical and empirical methods to analyze certain networked dynamical systems that cannot be studied using standard tools and techniques, and (ii) to develop suitable mathematical techniques for the systematic study of such systems. As our main contribution resulting from (i), we use the properties of random time-varying networks to provide a rigorous theoretical foundation for the age-structured Susceptible-Infected-Recovered (SIR) model, a model of epidemic spreading. We then use system identification to show that the age-structured SIR dynamics accurately model the spread of COVID-19 at city and state levels in two different parts of the world: Tokyo and California. As for our contributions resulting from (ii), we extend two assertions of the Perron-Frobenius theorem to time-varying networks described by strongly aperiodic stochastic chains, thereby widening the applicability of this fundamental result, which is foundational to probability theory and to the studies of complex networks, population dynamics, internet search engines, etc. Our results enable us to extend several known results on distributed learning and averaging. Moreover, they promise to advance our understanding of dynamical processes over real-world networks. As an application of these results, we study non-Bayesian social learning on random time-varying networks that violate the standard connectivity criterion of uniform strong connectivity. In doing so, we also make a methodological contribution: we show how the theory of absolute probability sequences and martingale theory can be combined to analyze nonlinear dynamics that approximate linear dynamics asymptotically in time. Finally, we study the convergence properties of social Hegselmann-Krause dynamics (a variant of the classical Hegselmann-Krause model of opinion dynamics that incorporates state-dependence into distributed averaging). As our main contribution here, we provide nearly tight necessary and sufficient conditions for a given connectivity graph to exhibit unbounded epsilon-convergence times for such dynamics.
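For readers unfamiliar with state-dependent averaging, the following is a minimal sketch of the classical Hegselmann-Krause model only, not the social variant studied in the dissertation above. The function names, the confidence radius `eps`, and the initial opinions are assumptions chosen for illustration.

```python
def hk_step(opinions, eps):
    """One synchronous update of the classical Hegselmann-Krause model."""
    updated = []
    for x_i in opinions:
        # Agent i averages over every agent (itself included) whose
        # opinion lies within the confidence radius eps of its own.
        neighbours = [x_j for x_j in opinions if abs(x_j - x_i) <= eps]
        updated.append(sum(neighbours) / len(neighbours))
    return updated


def run_hk(opinions, eps, tol=1e-9, max_steps=1000):
    """Iterate until no opinion moves by more than tol (or max_steps is hit)."""
    for step in range(max_steps):
        nxt = hk_step(opinions, eps)
        if max(abs(a - b) for a, b in zip(nxt, opinions)) < tol:
            return nxt, step
        opinions = nxt
    return opinions, max_steps


# Hypothetical initial opinions forming two well-separated groups.
initial = [0.0, 0.1, 0.2, 0.8, 0.9, 1.0]
final, steps = run_hk(initial, eps=0.25)
print(final)  # the two groups settle near 0.1 and 0.9 respectively
```

The averaging weights depend on the current opinions, which is what makes the dynamics state-dependent: agents farther apart than `eps` never influence each other, so the population can split into clusters that do not merge.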